Convex Clustering via Optimal Mass Transport
Abstract
We consider approximating distributions within the framework of optimal mass transport and specialize to the problem of clustering data sets. Distances between distributions are measured in the Wasserstein metric. The main problem we consider is that of approximating sample distributions by ones with sparse support. This provides a new viewpoint on clustering. We propose different relaxations of a cardinality function that penalizes the size of the support set. We establish that a certain relaxation provides the tightest convex lower approximation to the cardinality penalty. We compare the performance of alternative relaxations in a numerical study on clustering.
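To make the "sparse support" viewpoint concrete, here is a minimal sketch. It is not the relaxation proposed in the paper. It assumes a squared Euclidean ground cost and a fixed candidate support set whose weights are free to choose. Under those assumptions, the optimal transport plan sends each sample to its nearest support point, so the squared Wasserstein-2 cost reduces to the mean squared distance to the nearest support point. The function name `sparse_support_cost` and the toy data are hypothetical.

```python
from collections import Counter


def sparse_support_cost(samples, support):
    """Transport cost from the empirical distribution of `samples` to the
    best distribution supported on `support` (weights chosen freely).

    Assumption: squared Euclidean ground cost, so each sample is moved
    to its nearest support point.
    Returns (mean squared distance, induced weights on the support).
    """
    n = len(samples)
    total = 0.0
    counts = Counter()
    for x in samples:
        # Squared distance from this sample to every candidate support point.
        dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in support]
        # Index of the nearest support point.
        j = min(range(len(support)), key=dists.__getitem__)
        total += dists[j]
        counts[j] += 1
    # Mass received by each support point under the nearest-point plan.
    weights = [counts[j] / n for j in range(len(support))]
    return total / n, weights


# Toy data: two well-separated pairs of points in the plane.
samples = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
support = [(0.0, 1.0), (10.0, 1.0)]
cost, weights = sparse_support_cost(samples, support)
print(cost, weights)
```

A smaller support set generally raises this cost. The cardinality penalty described in the abstract is what trades that cost against the number of support points.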
Similar resources
Vector-Valued Optimal Mass Transport
In this note, we propose a straightforward notion of transport distance on a graph that is readily computable via a convex optimization reformulation. Similar ideas lead to a Wasserstein distance on vector-valued densities, which allows us to apply optimal mass transport to graphs whose nodes represent vectorial and not just scalar data. We are interested in the application to various communicati...
Modified Convex Data Clustering Algorithm Based on Alternating Direction Method of Multipliers
Because the main weakness of most standard methods, including k-means and hierarchical data clustering, is their sensitivity to initialization and their tendency to get trapped in local minima, this paper proposes a modification of convex data clustering in which there is no need to be particular about how initial values are selected. Due to properly converting the task of optimization to an equivalent...
Optimal transport over nonlinear systems via infinitesimal generators on graphs
We present a set-oriented graph-based framework for continuous-time optimal transport over nonlinear dynamical systems. Our approach allows us to recover provably optimal control laws for steering a given initial distribution in phase space to a final distribution in prescribed finite time for the case of nonlinear control-affine systems. The action of the controlled vector fields is approximat...
State tracking of linear ensembles via optimal mass transport
We consider the problem of tracking an ensemble of indistinguishable agents with linear dynamics based only on output measurements. In this setting, the dynamics of the agents can be modeled by distribution flows in the state space and the measurements correspond to distributions in the output space. In this paper we formulate the corresponding state estimation problem using optimal mass trans...
Optimal Mass Transport on Metric Graphs
We study an optimal mass transport problem between two equal masses on a metric graph where the cost is given by the distance in the graph. To solve this problem we find a Kantorovich potential as the limit of p−Laplacian type problems in the graph where at the vertices we impose zero total flux boundary conditions. In addition, the approximation procedure allows us to find a transport density ...